Abstract 

Background: Dominance effects may play an important role in the genetic variation of complex traits. Full-featured and 
easy-to-use computing tools for genomic prediction and variance component estimation of additive and dominance 
effects using genome-wide single nucleotide polymorphism (SNP) markers are necessary to understand the contribution 
of dominance to a complex trait and to utilize dominance for selecting individuals with favorable genetic potential. 

Results: The GVCBLUP package is a shared memory parallel computing tool for genomic prediction and variance 
component estimation of additive and dominance effects using genome-wide SNP markers. This package currently has 
three main programs (GREML_CE, GREML_QM, and GCORRMX) and a graphical user interface (GUI), GVCeasy, that integrates 
the three main programs with an existing program for the graphical viewing of SNP additive and dominance effects. 
The GREML_CE and GREML_QM programs offer complementary computing advantages with identical 
results for genomic prediction of breeding values, dominance deviations and genotypic values, and for genomic 
estimation of additive and dominance variances and heritabilities using a combination of the expectation-maximization 
(EM) algorithm and the average information restricted maximum likelihood (AI-REML) algorithm. GREML_CE is designed for 
large numbers of SNP markers and GREML_QM for large numbers of individuals. Test results showed that GREML_CE 
could analyze 50,000 individuals with 400 K SNP markers and GREML_QM could analyze 100,000 individuals with 50 K 
SNP markers. GCORRMX calculates genomic additive and dominance relationship matrices using SNP markers. GVCeasy 
is the GUI for GVCBLUP, integrated with an existing software tool for the graphical viewing of SNP effects and a function 
for editing the parameter files for the three main programs. 

Conclusion: The GVCBLUP package is a powerful and versatile computing tool for assessing the type and magnitude 
of genetic effects affecting a phenotype by estimating whole-genome additive and dominance heritabilities, for 
genomic prediction of breeding values, dominance deviations and genotypic values, for calculating genomic 
relationships, and for research and education in genomic prediction and estimation. 
Background 

Genomic prediction using genome-wide single nucleotide polymorphism (SNP) markers has become a powerful approach to capture genetic effects dispersed over the genome for predicting an individual's genetic potential for a phenotype [1-3]. Genomic estimation of variance components using genome-wide SNP markers is a powerful tool for estimating the genetic contribution of the whole genome to a phenotype and for addressing the missing heritability problem, where a large number of causal variants explained only a small fraction of the phenotypic variation. Dominance effects of quantitative traits are measured as the deviation of the mean value of the heterozygous genotype from the average of the two alternative homozygous genotypes [4,5]. The inclusion of dominance in the prediction model may improve the accuracy of genomic prediction when dominance effects are present [6-9]. However, currently available software packages for genomic prediction and variance component estimation either are designed for additive effects only (GCTA [10]) or require users to prepare a dominance-specific file to estimate dominance effects (BLR or BGLR [11], GenSel [12], DMU [13], BLUPF90 [14]). The user-friendliness of a computing tool affects the efficiency of data analysis for genomic prediction and estimation. To fill these gaps, we implemented two computationally complementary computing strategies with identical results, together with various definitions of genomic relationships, in the GVCBLUP package, which offers a wide range of flexibility and functionality for broad applicability of genomic prediction and estimation of additive and dominance effects. 

* Correspondence: yda@umn.edu 
Department of Animal Science, University of Minnesota, Saint Paul, MN 55108, USA 
Full list of author information is available at the end of the article 

Implementation 

GVCBLUP currently has three main programs and a 
graphical user interface (GUI) named GVCeasy that in- 
tegrates the three main programs with an existing pro- 
gram for graphical viewing of SNP effects. The three 
main programs are GREML_CE, GREML_QM, and 
GCORRMX, which are developed using shared memory 
parallel computing technology. GVCeasy provides users 
with a user-friendly platform to run GVCBLUP. 

Two complementary computing strategies 

Two sets of formulations with complementary comput- 
ing advantages and identical results based on two 
equivalent mixed models are implemented: the CE set 
for large numbers of SNP markers and the QM set for 
large numbers of individuals [5,15]. Using notations in 
[5], the mixed model and its variance-covariance matrix 
for the CE set of formulations are: 

$$y = Xb + ZT_\alpha\alpha + ZT_\delta\delta + e = Xb + Za + Zd + e \quad (1)$$

$$\mathrm{Var}(y) = V = ZA_gZ'\sigma_\alpha^2 + ZD_gZ'\sigma_\delta^2 + I_N\sigma_e^2 \quad (2)$$

where $X$ = $N \times c$ model matrix for fixed non-genetic effects, $b$ = $c \times 1$ column vector of fixed effects, $Z$ = $N \times q$ model matrix allocating phenotypic observations to SNP marker genotypes of individuals, $T_\alpha$ = $q \times m$ normalized model matrix for gene substitution effects of SNP markers, $\alpha$ = $m \times 1$ column vector of gene substitution effects of SNP markers, $T_\delta$ = $q \times m$ normalized model matrix for dominance effects of SNP markers, $\delta$ = $m \times 1$ column vector of dominance effects of SNP markers, $a = T_\alpha\alpha$ = $q \times 1$ genomic breeding values, $d = T_\delta\delta$ = $q \times 1$ genomic dominance deviations, $A_g$ = $q \times q$ genomic additive relationship matrix = $T_\alpha T_\alpha'$, $D_g$ = $q \times q$ genomic dominance relationship matrix = $T_\delta T_\delta'$, and $\sigma_\alpha^2$, $\sigma_\delta^2$ and $\sigma_e^2$ are additive, dominance and residual variances, respectively. The mixed model and its variance-covariance matrix for the QM set of formulations are: 

$$y = Xb + Z_1\alpha + Z_2\delta + e \quad (3)$$

$$\mathrm{Var}(y) = V = Z_1Z_1'\sigma_\alpha^2 + Z_2Z_2'\sigma_\delta^2 + I_N\sigma_e^2 \quad (4)$$

where $Z_1 = ZT_\alpha$ and $Z_2 = ZT_\delta$. The two models are equivalent because $Z_1Z_1' = ZT_\alpha T_\alpha'Z' = ZA_gZ'$ and, likewise, $Z_2Z_2' = ZD_gZ'$. The computing difficulty is the calculation of $V^{-1}$ and $P = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$ for the CE set of Equations 1-2, and the inverse of the coefficient matrix of the mixed model equations after absorbing fixed non-genetic effects (to be denoted by $C^{-1}$) for the QM set of Equations 3-4. The CE set has the best potential for using large numbers of SNP markers because the size of the $V^{-1}$ and $P$ matrices is determined by the number of individuals (assuming one observation per individual) and does not change for different numbers of SNPs. Similarly, the QM set has the best potential for using large numbers of individuals because the size of the $C^{-1}$ matrix is determined by the number of SNP markers and does not change for different numbers of individuals. 
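
The size argument above can be made concrete with a small sketch. It assumes double-precision dense storage (8 bytes per element), one observation per individual (so that V is q x q), and that the absorbed QM coefficient matrix has dimension 2m (m additive plus m dominance effects). The function names are illustrative only and are not part of GVCBLUP.

```python
def dense_matrix_gib(dim: int, bytes_per_element: int = 8) -> float:
    """Memory in GiB needed to hold one dim x dim dense matrix."""
    return dim * dim * bytes_per_element / 2**30


def dominant_matrix_dims(q: int, m: int) -> dict:
    """Dimension of the dense matrix that must be inverted by each formulation.

    Assumptions (not taken from the GVCBLUP source code):
      * CE inverts V, which is q x q with one record per individual;
      * QM inverts the absorbed coefficient matrix C, taken here as 2m x 2m.
    """
    return {"CE": q, "QM": 2 * m}


def smaller_formulation(q: int, m: int) -> str:
    """Return the formulation whose matrix to invert is smaller, or 'tie'."""
    dims = dominant_matrix_dims(q, m)
    if dims["CE"] == dims["QM"]:
        return "tie"
    return min(dims, key=dims.get)


if __name__ == "__main__":
    # (q, m) pairs: the two simulated datasets and two large test cases.
    for q, m in [(1000, 3000), (3000, 1000), (50_000, 400_000), (100_000, 50_000)]:
        dims = dominant_matrix_dims(q, m)
        sizes = {name: round(dense_matrix_gib(d), 2) for name, d in dims.items()}
        print(q, m, smaller_formulation(q, m), sizes)
```

The comparison only reflects the dimension of the matrix to be inverted; actual running time and memory also depend on the matrix multiplications each formulation requires.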

EM-REML and AI-REML 

Two algorithms for restricted maximum likelihood (REML) estimation of variance components are implemented in both GREML_CE and GREML_QM: an EM-type algorithm (EM-REML) and the AI-REML algorithm [5]. AI-REML is generally much faster than EM-REML but is not as robust and may be sensitive to the initial values of variance components in the iterations. We require at least two iterations of EM-REML, and the user may specify a larger number of EM-REML iterations to produce better initial values of variance components than the user-provided initial values before switching to AI-REML. When AI-REML yields a negative estimate for any of the variance components, the program automatically returns to EM-REML, which yields non-negative estimates of variance components. This strategy is designed to guarantee GREML_CE and GREML_QM estimates of variance components to be positive. 
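
The switching strategy described above can be sketched as a control loop. This is a minimal illustration, not the GVCBLUP implementation: `em_step` and `ai_step` are hypothetical callables that each map the current variance components to updated ones, the fallback to EM-REML is assumed to apply to the single iteration in which AI-REML produced a negative value, and the default tolerance and iteration cap are placeholders.

```python
from typing import Callable, List, Sequence, Tuple

# A step takes the current variance components and returns updated ones.
Step = Callable[[Sequence[float]], Sequence[float]]


def hybrid_reml(em_step: Step, ai_step: Step, init: Sequence[float],
                min_em_iters: int = 2, tol: float = 1e-8,
                max_iters: int = 1000) -> Tuple[List[float], List[str]]:
    """Run EM-REML first, then AI-REML, falling back to EM on negative values.

    Returns the final variance components and the algorithm used per iteration.
    """
    if min_em_iters < 2:
        raise ValueError("at least two EM-REML iterations are required")
    theta = list(init)
    history: List[str] = []
    for iteration in range(max_iters):
        if iteration < min_em_iters:
            new_theta, used = list(em_step(theta)), "EM"
        else:
            new_theta, used = list(ai_step(theta)), "AI"
            if any(value < 0.0 for value in new_theta):
                # Assumed behaviour: discard the AI update and redo it with EM.
                new_theta, used = list(em_step(theta)), "EM"
        history.append(used)
        change = max(abs(new - old) for new, old in zip(new_theta, theta))
        theta = new_theta
        if change < tol:
            break
    return theta, history
```

With toy update rules (for example, an `em_step` that moves halfway towards a fixed point and an `ai_step` that jumps to it), the history shows the required EM-REML iterations followed by AI-REML iterations until the change falls below the tolerance.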

Shared memory parallel computing 

GVCBLUP is programmed in C++ using Eigen [16] and the Intel Math Kernel Library (MKL) [17]. Eigen is a C++ template library for linear algebra that supports large dense and sparse matrices and provides easy-to-use expressions for coding linear algebra. Intel MKL provides BLAS and LAPACK linear algebra routines and is optimized for multi-core Intel processors through shared memory parallel computing. In GVCBLUP, MKL is used for dense matrix inversion, including $V^{-1}$ and $C^{-1}$, and for dense matrix multiplications involving those two matrices. 

Calculation and graphical viewing of SNP effects and 
heritabilities 

Both GREML_CE and GREML_QM can output additive and dominance marker effects as well as additive and dominance marker heritabilities for every SNP. SNP additive and dominance effects for GREML_CE are calculated at the last GREML iteration using the following formulations: 

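$$\hat{\alpha} = \sigma_\alpha^2 T_\alpha' Z'Py \quad (5)$$

$$\hat{\delta} = \sigma_\delta^2 T_\delta' Z'Py \quad (6)$$

where $\hat{\alpha}$ = GBLUP of SNP average effects of gene substitution, $\hat{\delta}$ = GBLUP of SNP dominance effects, $P = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$, and $V$ is defined by Equation 2. SNP effects for GREML_QM are obtained directly from the mixed model equations for the QM model (Equation 19 in [5]). According to the EM-REML formulation of the additive or dominance variance component [5], we calculate the variance of each SNP marker as the marker contribution to the whole-genome SNP variance defined by its EM-REML formula. Let $\sigma_{\alpha i}^2$ = additive variance of the ith SNP, and $\sigma_{\delta i}^2$ = dominance variance of the ith SNP. Then, for GREML_CE, additive and dominance variances of the ith SNP are calculated as: 

$$\sigma_{\alpha i}^2 = \hat{\alpha}_i^2 / \mathrm{tr}(PZA_gZ'), \quad \sigma_{\delta i}^2 = \hat{\delta}_i^2 / \mathrm{tr}(PZD_gZ'),$$

and for GREML_QM, 

$$\sigma_{\alpha i}^2 = \hat{\alpha}_i^2 / [m - \mathrm{tr}(C^{\alpha\alpha})\lambda_\alpha], \quad \sigma_{\delta i}^2 = \hat{\delta}_i^2 / [m - \mathrm{tr}(C^{\delta\delta})\lambda_\delta],$$

where $\hat{\alpha}_i$ = additive GBLUP of the ith SNP, $\hat{\delta}_i$ = dominance GBLUP of the ith SNP, $r$ = rank of the coefficient matrix of the mixed model equations, $\lambda_\alpha = \sigma_e^2/\sigma_\alpha^2$, $\lambda_\delta = \sigma_e^2/\sigma_\delta^2$, $\hat{e} = y - X\hat{b} - Z_1\hat{\alpha} - Z_2\hat{\delta}$, and $C^{\alpha\alpha}$ and $C^{\delta\delta}$ are defined by Equation 22 in [5]. For the ith SNP marker, additive heritability or heritability in the narrow sense ($h_{\alpha i}^2$), dominance heritability ($h_{\delta i}^2$) and the total heritability or heritability in the broad sense ($H_i^2$) are: 

$$h_{\alpha i}^2 = \sigma_{\alpha i}^2/\sigma_y^2 = \left(\hat{\alpha}_i^2 / \sum_{j=1}^{m}\hat{\alpha}_j^2\right) h_\alpha^2 \quad (7)$$

$$h_{\delta i}^2 = \sigma_{\delta i}^2/\sigma_y^2 = \left(\hat{\delta}_i^2 / \sum_{j=1}^{m}\hat{\delta}_j^2\right) h_\delta^2 \quad (8)$$

$$H_i^2 = h_{\alpha i}^2 + h_{\delta i}^2 \quad (9)$$

where $\sigma_y^2 = \sigma_\alpha^2 + \sigma_\delta^2 + \sigma_e^2$ = phenotypic variance, $h_\alpha^2$ = total additive heritability of all SNP markers, and $h_\delta^2$ = total dominance heritability of all SNP markers. The output file for the SNP effects and heritabilities of Equations 5-9 is designed such that the SNP effects and heritability estimates can be directly used as the input file for graphing and graphical viewing by SNPEVG2 [18]. 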
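
Equations 7-9 amount to splitting each total heritability among SNPs in proportion to their squared GBLUP effects. The following minimal sketch illustrates that arithmetic; the function names and the example values are hypothetical, and the handling of an all-zero effect vector (returning zero heritabilities) is an assumption made here to avoid division by zero.

```python
from typing import List, Sequence, Tuple


def _share(effects: Sequence[float], total_h2: float) -> List[float]:
    """Split total_h2 among SNPs in proportion to squared effects (Eq. 7 or 8)."""
    squares = [e * e for e in effects]
    denom = sum(squares)
    if denom == 0.0:
        # Assumed convention: no estimated effects means no SNP heritability.
        return [0.0] * len(squares)
    return [total_h2 * s / denom for s in squares]


def snp_heritabilities(alpha_hat: Sequence[float], delta_hat: Sequence[float],
                       h2_alpha: float, h2_delta: float
                       ) -> Tuple[List[float], List[float], List[float]]:
    """Return per-SNP additive, dominance and broad-sense heritabilities."""
    if len(alpha_hat) != len(delta_hat):
        raise ValueError("effect vectors must have one entry per SNP")
    h2_a = _share(alpha_hat, h2_alpha)
    h2_d = _share(delta_hat, h2_delta)
    broad = [a + d for a, d in zip(h2_a, h2_d)]  # Eq. 9
    return h2_a, h2_d, broad


if __name__ == "__main__":
    # Hypothetical GBLUP effects for three SNPs.
    h2_a, h2_d, broad = snp_heritabilities([0.3, -0.1, 0.2], [0.05, 0.0, -0.05],
                                           h2_alpha=0.3, h2_delta=0.1)
    print(h2_a, h2_d, broad)
```

By construction, the per-SNP additive heritabilities sum to the total additive heritability, and likewise for dominance.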

Simulated test data 

Two simulated datasets are supplied in the GVCBLUP package for testing purposes. One dataset (dataset_1) has 1000 genotyped individuals with 3000 SNP markers and the other (dataset_2) has 3000 genotyped individuals with 1000 SNP markers. The parameter files to run GVCBLUP programs for the simulated datasets are also included in the package. These simulated data are designed for GVCBLUP exercises and for showing the complementary advantages of the CE and QM sets of formulations. Users interested in GVCBLUP exercises using large datasets could use a publicly available swine dataset with over 45,000 SNP markers on 3534 individuals [19] that was used for comparing GREML estimates by GVCBLUP with the corresponding REML estimates using pedigree relations [5]. 

Results and discussion 

The structure of the GVCBLUP package with three main programs of GREML_CE, GREML_QM and GCORRMX is shown in Figure 1, and details of each program are described below. 

Figure 1 Structure of the GVCBLUP package (m = number of SNP markers, q = number of individuals). GREML_CE (CE implementation, designed for 2m > q, applicable to singular relationship matrices) and GREML_QM (MME implementation, designed for q > 2m, applicable to singular relationship matrices) produce estimates of genomic additive, dominance and residual variances, genomic additive and dominance heritabilities and heritability in the broad sense (GREML); genomic prediction (GBLUP) and reliabilities of genomic breeding values, dominance deviations and genetic values for training and validation populations; and additive and dominance effects and heritabilities for every SNP marker, which SNPEVG2 displays as a Manhattan plot for all chromosomes and as graphs and graphical views by chromosome. GCORRMX calculates genomic relationships (definitions of genomic additive and dominance relationships in the literature) and genomic additive and dominance correlations for selected or all pairs of individuals. 

GREML_CE and GREML_QM programs 

The GREML_CE and GREML_QM programs calculate GREML estimates of additive, dominance and residual variances, additive and dominance heritabilities, as well as heritability in the broad sense as the summation of the additive and dominance heritabilities. GBLUP and reliability of breeding value, dominance deviation and genotypic value (summation of breeding value and dominance deviation) of each individual in the training or validation population are calculated at the end of variance component estimation. GREML_CE and GREML_QM offer complementary computing advantages with identical GREML and GBLUP results: GREML_CE for large numbers of SNP markers and GREML_QM for large numbers of individuals. Assuming one observation per individual, GREML_CE is more efficient than GREML_QM if 2m > q and is less efficient than GREML_QM if q > 2m, where q = number of individuals and m = number of SNP markers. The example in Table 1 shows the complementary computing advantages of GREML_CE and GREML_QM. Both programs produced identical results (Additional file 1: Supplementary output files) and required the same numbers of iterations (Table 1). For 1000 individuals and 3000 SNP markers, GREML_CE required 5 seconds and GREML_QM required 69 seconds, whereas for 3000 individuals and 1000 SNP markers, GREML_CE required 32 seconds and GREML_QM required 6 seconds (Table 1). Given q = 2m, the required memory storage of GREML_QM is approximately 1.5 times larger than that of GREML_CE, but GREML_QM is faster than GREML_CE because GREML_CE requires twice as many multiplications between large dense matrices. The shared memory parallel computing of GREML_CE and GREML_QM achieved excellent scalability on the ItascaSB cluster with two eight-core Sandy Bridge E5-2670 processor chips (2.6 GHz) per node, 256 GB memory, and the Linux operating system (Figure 2). Scalability refers to the stability of the average performance of a parallel program as the number of processors increases. Ideal scalability is achieved when the efficiency of k processor-cores ($E_k$) is $E_k = S_k/k = 1$, where $S_k$ = the ratio of the execution time with one processor-core to the execution time of the parallel algorithm with k processor-cores [20]. 
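
The efficiency measure $E_k = S_k/k$ can be computed directly from wall-clock timings. The short sketch below does this; the function names are illustrative and the timings in the usage lines are hypothetical values, not measurements from the paper.

```python
def speedup(time_one_core: float, time_k_cores: float) -> float:
    """S_k: execution time with one core divided by the time with k cores."""
    if time_one_core <= 0.0 or time_k_cores <= 0.0:
        raise ValueError("execution times must be positive")
    return time_one_core / time_k_cores


def efficiency(time_one_core: float, time_k_cores: float, k: int) -> float:
    """E_k = S_k / k; a value of 1 corresponds to ideal scalability."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return speedup(time_one_core, time_k_cores) / k


if __name__ == "__main__":
    # Hypothetical timings in seconds for 1 core and for 16 cores.
    print(efficiency(160.0, 10.0, 16))  # ideal case
    print(efficiency(160.0, 20.0, 16))  # half of the ideal speedup
```

An efficiency close to 1 over a range of k indicates that adding cores continues to reduce running time nearly proportionally.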

GREML_CE and GREML_QM each have three output files for the results of GREML, GBLUP, and SNP effects and heritabilities, in addition to screen displays (Additional file 1: Supplementary output files). The GREML output file contains estimates and standard errors of variance components at each iteration, and the final estimates of variance components, heritabilities and their standard errors. The GBLUP output file contains GBLUP of breeding values, dominance deviations, genotypic values, and the corresponding reliabilities for both training and validation populations. These GBLUP results are calculated using the GREML estimates at the last iteration. Both GREML_CE and GREML_QM have a user option to output SNP additive and dominance marker effects and heritabilities for every SNP. The SNP effects and heritabilities can be readily graphed and displayed by SNPEVG2 [18], including Manhattan plots and graphs by chromosome using the original SNP GBLUP values (Figure 3: A and B), the absolute SNP GBLUP values (Figure 3: C and D), SNP additive and dominance heritabilities in the scale of percentages (Figure 3: E and F), or SNP additive and dominance heritabilities in the log10 scale (Figure 3: G and H). The procedure to generate the Manhattan plots and chromosome figures is shown in Figure 4. 

Table 1 Computing time (seconds) using GREML_CE and GREML_QM for simulated datasets^a 

                                 q = 1000, m = 3000 (Dataset_1)    q = 3000, m = 1000 (Dataset_2)
                                 GREML_CE      GREML_QM            GREML_CE      GREML_QM
Time for SNP input, Ag and Dg    1             1                   1             1
Time per iteration               ~0.2          6                   3             ~0.6
Number of iterations             10            10                  7             7
Total time                       5             69                  32            6

^a The two programs were run on a personal computer (PC) with an Intel Core i7-2600 processor (4 cores, 3.40 GHz) and 8 GB of memory. 

Figure 2 Excellent scalability of shared memory parallel computing of GREML_CE (left) and GREML_QM (right). 

Figure 3 Graphical viewing of SNP additive and dominance effects and heritabilities. A: Manhattan plot of the original GBLUP values of SNP additive effects. B: Chromosome 14 graph of the original GBLUP values of SNP additive and dominance effects. C: Manhattan plot of the absolute GBLUP values of SNP additive effects. D: Chromosome 14 graph of the absolute GBLUP values of SNP additive and dominance effects. E: Manhattan plot of SNP additive heritabilities in percentage scale. F: Chromosome 14 graph of SNP additive and dominance heritabilities in percentage scale. G: Manhattan plot of SNP additive heritabilities in log10 scale. H: Chromosome 14 graph of SNP additive and dominance heritabilities in log10 scale. Dominance GBLUP values were all virtually zero, consistent with the fact that the phenotypic values for fat percentage were PTA values of additive effects. The highly significant chromosome 14 region is the DGAT1 region, and the graphs of C-F are similar to those using stratification corrections reported in Ma et al. [21]. The total additive heritability of SNP markers in the 1675278-4606904 bp region of chromosome 14 that includes DGAT1 was 0.0248. Although additive heritabilities of other SNPs were much smaller than those in and near the DGAT1 region, those additive heritabilities were still considerably larger than dominance heritabilities, which were virtually zero for all SNPs. 

Figure 4 Procedure of using SNPEVG2 to generate graphs and interactive graphical views. This procedure can be summarized as: 1) Open SNPEVG2, 2) Load the 'mark_effect.snpe' file using the 'Browse' tab on the GUI of SNPEVG2, 3) Click 'Setting' and check 'original value' for the Y1 axis, 4) Change 'original value' to a user-defined title for the Y1 axis, 5) Click the button pointed to by the green arrow to define pixel size and to select a color template for the graphs, 6) Click 'run', 7) View the graph by scrolling up and down in the top right window, 8) Save 'All graphs' or 'Current graph'. Annotations in the figure: load 'mrk_eff.snpe', which we renamed as 'mrk_eff_fpc.snpe'; check 'original value' to use GBLUP values and replace 'original value' by the title for Y1, e.g., 'Fat percentage PTA' in our example; check the '_A' and '_D' variables to use original GBLUP values, and check the '_A2' and '_D2' variables to use absolute GBLUP values. SNPEVG2 is included in the SNPEVG package that is freely available at: http://animalgene.umn.edu/. 

Table 2 Capacity and speed of GVCBLUP for genomic estimation of additive, dominance and residual variances (tolerance = 10"^) on the ItascaSB supercomputer 

                                 GREML_CE      GREML_CE      GREML_QM      GREML_QM^a
Number of individuals (q)        20,000        50,000        200,000       100,000
Number of SNP markers (m)        1 million     400,000       10,000        50,000
Time for SNP input, Ag and Dg    3.7 hrs       6.0 hrs       14.9 min      0.33 hrs
Time per iteration               3.1 min       0.77 hrs      1.5 min       2.25 hrs
Total time                       4.8 hrs       23.2 hrs      2 hrs         ~45.83 hrs
Number of iterations             12            13            20            20

^a Computing time for calculating GBLUP reliabilities is not included. 

Table 3 Comparison of iteration numbers of EM-REML and AI-REML (tolerance = 10~^) using simulated data with different heritability levels 

               $h_\alpha^2$ = 0.0, $h_\delta^2$ = 0.0      $h_\alpha^2$ = 0.3, $h_\delta^2$ = 0.3
Replication    EM-REML       AI-REML                       EM-REML       AI-REML
1              173           1                             322           9
2              231                                         386           12
3              348                                         348           9
4              359                                         354           8
5              481           18                            458           10
6              138                                         295           10
7              871                                         416           8
8              134                                         353           9
9              291           16                            336           12
10             1000          1000^a                        431           11

^a AI-REML failed. 

Wang et al. BMC Bioinformatics 2014, 15:270 
http://www.biomedcentral.com/1471-2105/15/270 

Numerical evaluations showed that the AI-REML algorithm for both GREML_CE and GREML_QM had a fast convergence rate, requiring 12-20 iterations to converge with a strict tolerance level of 10"^ (Table 2), compared to 295-458 iterations using EM-REML (Table 3). The SNP input and the calculation of the genomic relationship matrices (Ag and Dg) required more computing time than a single iteration of the estimation step. GREML_CE was able to use 50,000 individuals with 400 K SNP markers with a total computing time of about 23 hours for 13 iterations. For 20,000 individuals and one million SNP markers, GREML_CE only required 4.8 hours. GREML_QM was highly efficient for using low-density SNP markers, requiring only 2 hours for 200,000 individuals with 10 K SNP markers. For 100,000 individuals with 50 K SNP markers, GREML_QM required about 46 hours for 20 iterations (Table 2). Although AI-REML was fast, extreme heritability levels (0 or 1) generally would cause failure of AI-REML. For eight of ten replications with null heritability, AI-REML failed, but the variance components still could be estimated with EM-REML (Table 3). AI-REML was successful for all ten replications with heritability of 0.3. 

Figure 5 GVCeasy graphical user interface (GUI) for GVCBLUP. A: The main control of GVCeasy. Any of the three main programs may be launched from here and the same program may be opened multiple times. B: The GUI for GREML_CE with a tab to launch SNPEVG2 to graph and view SNP additive and dominance effects. C: The GUI for GREML_QM with a tab to launch SNPEVG2 to graph and view SNP additive and dominance effects. D: The GUI for GCORRMX. 

In addition to the tests in Table 1 using the simulation datasets we provide with the GVCBLUP package, the GREML_CE and GREML_QM programs were extensively evaluated using simulation data under various assumptions, and the GREML estimates were compared to the REML estimates of additive heritabilities of five traits using pedigree relationships in a publicly available swine dataset of 3534 pigs with the 60 K SNP data [5]. GREML and GBLUP generally were able to capture small additive and dominance effects that each accounted for 0.00005-0.0003 of the phenotypic variance, and GREML was able to differentiate true additive and dominance heritability levels [5]. The inclusion of dominance in the prediction model resulted in improved accuracy of genomic prediction [8], and the genomic models with additive and dominance effects were more accurate for the estimation of variance components than their pedigree-based counterparts [7]. In a study of trout propensity to migrate, genomic-predicted additive effects completely separated migratory and nonmigratory fish in the wild population with 95.5% additive heritability and 4.5% dominance heritability, whereas genomic-predicted dominance effects achieved such complete separation in the dam-blocked population with 0% additive heritability and 39.3% dominance heritability [22], showing the importance of accounting for the exact effect type in the prediction model. 

GCORRMX program 

The GCORRMX program is designed to calculate measures of genomic similarities among individuals. This program currently calculates the Ag and Dg matrices for six definitions [23]. An example of the GCORRMX output files is given in Additional file 1: Supplementary output files. 
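
As an illustration of how a genomic additive relationship matrix of the form $A_g = T_\alpha T_\alpha'$ can be built from SNP genotypes, the sketch below uses one common normalization from the literature (centering allele counts by twice the allele frequency and scaling by the summed heterozygosity, in the style of VanRaden [2]). This choice is an assumption made for the example; it is not claimed to be any particular one of the six definitions in GCORRMX, the dominance matrix is not covered, and the function names are hypothetical.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]


def additive_model_matrix(genotypes: List[List[int]]) -> Matrix:
    """Build a normalized q x m matrix T from 0/1/2 allele counts.

    Assumed definition: T[i][j] = (x_ij - 2 p_j) / sqrt(sum_j 2 p_j (1 - p_j)),
    where p_j is the observed frequency of the counted allele at SNP j.
    """
    q = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2.0 * q) for j in range(m)]
    scale = sum(2.0 * p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic")
    root = sqrt(scale)
    return [[(row[j] - 2.0 * freqs[j]) / root for j in range(m)]
            for row in genotypes]


def relationship_matrix(T: Matrix) -> Matrix:
    """Return T T' (q x q), the product form used for Ag and Dg in the text."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in T]
            for row_i in T]


if __name__ == "__main__":
    # Hypothetical genotypes: 4 individuals (rows) by 3 SNPs (columns).
    genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
    A = relationship_matrix(additive_model_matrix(genotypes))
    for row in A:
        print([round(value, 3) for value in row])
```

Because each column of T is centered at the observed allele frequency, every row of the resulting matrix sums to zero up to rounding error, which is a convenient sanity check for this particular normalization.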

GVCeasy: Graphical user interface (GUI) for GVCBLUP 

The three main programs of GVCBLUP are command 
line programs. GVCeasy is a Java program developed as 
a user-friendly GUI with a capability to run GVCBLUP 
by mouse clicks, providing considerable convenience for 
users not familiar with command line operations. 
GVCeasy can launch any of the three main programs of 
GVCBLUP and provides a capability of editing the parameter 
file for each main program (Figure 5). In addition, 
SNPEVG2 can be launched from the GREML_CE or 
GREML_QM window of GVCeasy for graphical viewing of 



SNP additive and dominance effects. To run GVCeasy, 
the programs of GVCeasy, GREML_CE, GREML_QM, 
GCORRMX and the SNPEVG package that includes 
SNPEVG2 need to be placed in the same directory. 
GVCeasy is applicable to Windows, Linux and Mac OS X 
versions of GVCBLUP. 

Conclusions 

The GVCBLUP package is a powerful and user-friendly computing tool for assessing the type and magnitude of genetic effects affecting a phenotype by estimating whole-genome additive and dominance heritabilities of the phenotype using genome-wide SNP markers. It is a full-featured computing tool for genomic prediction of breeding values, dominance deviations and genotypic values for both training and validation data sets, and it provides an important computing utility for research and education in the area of genomic prediction and estimation. 

Availability and requirements 

Project name: GVCBLUP 

Project home page: http://animalgene.umn.edu/ 
Operating system(s): Windows, Linux and Mac OS X 
Programming language: C++, Java 
License: None 

Additional file 



Additional file 1: Supplementary output files. 



Abbreviations 

SNP: Single nucleotide polymorphism; BLUP: Best linear unbiased prediction; 
GBLUP: Genomic BLUP; REML: Restricted maximum likelihood estimation; 
GREML: Genomic REML; EM: Expectation-maximization; AI-REML: Average 
information REML; GUI: Graphical user interface; MME: Mixed model 
equations. 

Competing interests 

The authors declare that they have no competing interests. 
Authors' contributions 

CW is the author of Versions 2.1-2.2 and 3.1-3.9 using shared memory 
parallel computing, and initiated and implemented the AI-REML algorithm. 
HBR initiated the use of the MKL libraries, and DP and HBR implemented the 
Linux Versions 3.1 and 3.2 using MKL. DP performed the testing of the Mac 
OS X version of GVCBLUP 3.9. SW is the author of Version 1.1 of GVCBLUP 
using serial computing, and CW and SW conducted simulation studies to 
evaluate GVCBLUP. SP is the author of GVCeasy 1.1 and 1.2. CW and DP are 
the authors of GVCeasy 1.3. YD is the project leader and the lead writer of 
the manuscript. All authors read and approved the final manuscript. 

Acknowledgements 

This research was supported by USDA National Institute of Food and 
Agriculture Grant no. 2011-67015-30333 and by project MN-16-043 of the 
Agricultural Experiment Station at the University of Minnesota. Supercomputer 
computing time was provided by the Minnesota Supercomputer Institute at 
the University of Minnesota and by the Research Computing Center at The 
University of Chicago. 






Author details 

^1 Department of Animal Science, University of Minnesota, Saint Paul, MN 
55108, USA. ^2 Research Computing Center, The University of Chicago, 
Chicago, IL 60637, USA. 

Received: 12 February 2014 Accepted: 30 July 2014 
Published: 9 August 2014 



References 

1. Meuwissen THE, Hayes BJ, Goddard ME: Prediction of total genetic value using genome-wide dense marker maps. Genetics 2001, 157(4):1819-1829. 

2. VanRaden P: Efficient methods to compute genomic predictions. J Dairy Sci 2008, 91(11):4414-4423. 

3. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW: Common SNPs explain a large proportion of the heritability for human height. Nat Genet 2010, 42(7):565-569. 

4. Falconer DS, Mackay TFC: Introduction to Quantitative Genetics. 4th edition. Harlow, Essex, UK: Longmans Green; 1996. 

5. Da Y, Wang C, Wang S, Hu G: Mixed model methods for genomic prediction and variance component estimation of additive and dominance effects using SNP markers. PLoS One 2014, 9(1):e87666. 

6. Hu G, Wang C, Da Y: Genomic heritability estimation for the early life-history transition related to propensity to migrate in wild rainbow and steelhead trout populations. Ecol Evol 2014. doi:10.1002/ece3.1038. 

7. Vitezica ZG, Varona L, Legarra A: On the additive and dominant variance and covariance of individuals within the genomic selection scope. Genetics 2013, 195(4):1223-1230. 

8. Nishio M, Satoh M: Including dominance effects in the genomic BLUP method for genomic evaluation. PLoS One 2014, 9(1):e85792. 

9. Sun C, VanRaden P, O'Connell J, Weigel K, Gianola D: Mating programs including genomic relationships and dominance effects. J Dairy Sci 2013, 96(12):8014-8023. 

10. Yang J, Lee SH, Goddard ME, Visscher PM: GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet 2011, 88(1):76-82. 

11. Perez P, de los Campos G, Crossa J, Gianola D: Genomic-enabled prediction based on molecular markers and pedigree using the Bayesian linear regression package in R. Plant Genome 2010, 3(2):106-116. 

12. Fernando R, Garrick D: GenSel-User Manual for a Portfolio of Genomic Selection Related Analyses. Ames: Animal Breeding and Genetics, Iowa State University; 2008 [http://taurus.ansci.iastate.edu/] 

13. Su G, Christensen OF, Ostersen T, Henryon M, Lund MS: Estimating additive and non-additive genetic variances and predicting genetic merits using genome-wide dense single nucleotide polymorphism markers. PLoS One 2012, 7(9):e45293. 

14. Aguilar I, Misztal I, Johnson DL, Legarra A, Tsuruta S, Lawlor TJ: Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J Dairy Sci 2010, 93(2):743-752. 

15. Da Y, Wang S: Joint genomic prediction and estimation of variance components of additive and dominance effects using SNP markers. Abstract P1004. Plant and Animal Genome XXI, January 12-16, 2013. San Diego. In [https://pag.confex.com/pag/xxi/webprogram/Paper7396.html] 

16. Eigen V3. In [http://eigen.tuxfamily.org] 

17. Intel Math Kernel Library Reference Manual. In Doc. No. 630813-061US, MKL 11.0, update 5. [http://download-software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/mklman.pdf] 

18. Wang S, Dvorkin D, Da Y: SNPEVG: a graphical tool for GWAS graphing with mouse clicks. BMC Bioinformatics 2012, 13(1):319. 

19. Cleveland MA, Hickey JM, Forni S: A common dataset for genomic analysis of livestock populations. G3: Genes|Genomes|Genetics 2012, 2(4):429-435. 

20. Ma L, Runesha HB, Dvorkin D, Garbe J, Da Y: Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies. BMC Bioinformatics 2008, 9(1):315. 

21. Ma L, Wiggans G, Wang S, Sonstegard T, Yang J, Crooker B, Cole J, Van Tassell C, Lawlor T, Da Y: Effect of sample stratification on dairy GWAS results. BMC Genomics 2012, 13(1):536. 

22. Hu G, Wang C, Da Y: Genomic heritability estimation for the early life-history transition related to propensity to migrate in wild rainbow and steelhead trout populations. Ecol Evol 2014, 4(8):1381-1388. 

23. Wang C, Prakapenka D, Wang S, Runesha HB, Da Y: GVCBLUP: a computer package for genomic prediction and variance component estimation of additive and dominance effects using SNP markers. Version 3.3. In Department of Animal Science, University of Minnesota. 2013. 



doi:10.1186/1471-2105-15-270 

Cite this article as: Wang et al.: GVCBLUP: a computer package for 
genomic prediction and variance component estimation of additive and 
dominance effects. BMC Bioinformatics 2014 15:270. 






